Handwritten Isolated Bangla Compound Character Recognition: a new benchmark using a novel deep learning approach
In this work, a novel deep learning technique for the recognition of
handwritten isolated Bangla compound characters is presented, and a new
benchmark of recognition accuracy on the CMATERdb 3.1.3.3 dataset is reported.
Greedy layer-wise training of deep neural networks has enabled significant
strides in various pattern recognition problems. We apply layer-wise training
to Deep Convolutional Neural Networks (DCNNs) in a supervised fashion and
augment the training process with the RMSProp algorithm to achieve faster
convergence. We compare results with those obtained from standard shallow
learning methods with predefined features, as well as standard DCNNs.
Supervised layer-wise trained DCNNs are found to outperform both standard
shallow learning models such as Support Vector Machines and regular DCNNs of
similar architecture, achieving an error rate of 9.67% and thereby setting a
new benchmark on CMATERdb 3.1.3.3 with a recognition accuracy of 90.33%, an
improvement of nearly 10%.
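The training process above is accelerated with RMSProp; as a minimal sketch of
the RMSProp update rule (the function name, learning rate, and toy objective
are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def rmsprop_step(w, grad, cache, lr=0.001, decay=0.9, eps=1e-8):
    # Keep a running average of squared gradients and divide the step by
    # its square root, so each weight gets an adaptive effective step size.
    cache = decay * cache + (1 - decay) * grad ** 2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache

# Toy usage: a few updates on f(w) = w^2, whose gradient is 2w.
w, cache = 5.0, 0.0
for _ in range(100):
    w, cache = rmsprop_step(w, 2 * w, cache, lr=0.1)
```

The per-parameter scaling is what gives RMSProp its faster convergence than
plain gradient descent on poorly scaled problems.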
A two-pass fuzzy-geno approach to pattern classification
This work presents an extension of the fuzzy approach to 2-D shape recognition
[1] through refinement of initial, or coarse, classification decisions in a
two-pass approach. An unknown pattern is classified by refining the possible
classification decisions obtained through its coarse classification. To build
a fuzzy model of a pattern class, horizontal and vertical fuzzy partitions on
the sample images of the class are optimized using a genetic algorithm. To
make coarse classification decisions about an unknown pattern, the fuzzy
representation of the pattern is compared with the models of all pattern
classes through a specially designed similarity measure. Coarse classification
decisions are refined in the second pass to obtain the final classification
decision for the unknown pattern. To do so, optimized horizontal and vertical
fuzzy partitions are again created on certain regions of the image frame,
specific to each group of similar pattern classes. Experiments show that the
technique improves the overall recognition rate from 86.2% in the first pass
to 90.4% after the second pass, with 500 training samples of handwritten
digits.
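As a rough illustration of the coarse first pass, the sketch below computes
cell-density features over fixed horizontal and vertical partitions and
compares them with a class model through a simple similarity measure. In the
paper the partition boundaries are optimized by a genetic algorithm and the
similarity measure is specially designed, so both functions here are
simplifying assumptions:

```python
import numpy as np

def partition_features(img, h_cuts, v_cuts):
    # Density features: mean foreground value in each cell of the grid
    # formed by horizontal and vertical partition boundaries.
    feats = []
    for band in np.split(img, h_cuts, axis=0):
        for cell in np.split(band, v_cuts, axis=1):
            feats.append(cell.mean())
    return np.array(feats)

def similarity(f1, f2):
    # Placeholder similarity: 1 minus the mean absolute feature difference.
    return 1.0 - np.abs(f1 - f2).mean()
```

In a two-pass scheme, the same machinery would be re-applied in the second
pass on class-group-specific sub-regions of the image frame.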
Segmentation of Offline Handwritten Bengali Script
Character segmentation has long been one of the most critical stages of the
optical character recognition process. Through this operation, an image of a
sequence of characters, which may be connected in some cases, is decomposed
into sub-images of individual alphabetic symbols. In this paper, segmentation
of cursive handwritten script of the world's fourth most popular language,
Bengali, is considered. Unlike English script, Bengali handwritten characters
and their components often encircle the main character, making conventional
segmentation methodologies inapplicable. Experimental results using the
proposed segmentation technique, on sample cursive handwritten data containing
218 ideal segmentation points, show a success rate of 97.7%. Further feature
analysis on these segments may lead to actual recognition of handwritten
cursive Bengali script. Comment: Proceedings of 28th IEEE ACE, pp. 171-174,
December 2002, Science City, Kolkata
Classification of Log-Polar-Visual Eigenfaces using Multilayer Perceptron
In this paper we present a simple novel approach to tackle the challenges of
scaling and rotation of face images in face recognition. The proposed approach
registers the training and testing visual face images by log-polar
transformation, which is capable of handling the complications introduced by
scaling and rotation. Log-polar images are projected into eigenspace and
finally classified using an improved multi-layer perceptron. In the
experiments we have used the ORL face database and the Object Tracking and
Classification Beyond Visible Spectrum (OTCBVS) database for visual face
images. Experimental results show that the proposed approach significantly
improves recognition performance from visual to log-polar-visual face images.
For the ORL face database, the recognition rate for visual face images is
89.5%, increasing to 97.5% for log-polar-visual face images; for the OTCBVS
face database, the recognition rate is 87.84% for visual images and 96.36% for
log-polar-visual face images.
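The registration step relies on the fact that scaling and rotation about the
image centre become translations in log-polar coordinates. A minimal sketch of
such a mapping follows; the grid sizes and nearest-neighbour sampling are
illustrative choices, as the paper does not specify its implementation:

```python
import numpy as np

def log_polar(img, n_r=32, n_theta=32):
    # Resample img on a log-radius x angle grid about its centre.
    # Scaling the input shifts the output along the radius axis;
    # rotating it shifts the output along the angle axis.
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = np.hypot(cy, cx)
    out = np.zeros((n_r, n_theta))
    for i in range(n_r):
        r = np.exp(np.log(max_r + 1) * (i + 1) / n_r) - 1
        for j in range(n_theta):
            t = 2 * np.pi * j / n_theta
            y = int(round(cy + r * np.sin(t)))
            x = int(round(cx + r * np.cos(t)))
            if 0 <= y < h and 0 <= x < w:
                out[i, j] = img[y, x]  # nearest-neighbour sample
    return out
```

The log-polar images would then be flattened and projected into eigenspace
before classification.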
An adaptive block based integrated LDP, GLCM, and morphological features for Face Recognition
This paper proposes a technique for automatic face recognition using
integrated multiple feature sets extracted from the significant blocks of a
gradient image. We discuss the use of novel morphological, local directional
pattern (LDP), and gray-level co-occurrence matrix (GLCM) based feature
extraction techniques to recognize human faces. First, the new morphological
features, i.e., features based on the number of runs of pixels in four
directions (N, NE, E, NW), are extracted from the significant blocks of the
gradient image, together with the GLCM-based statistical features and the LDP
features, which are less sensitive to noise and non-monotonic illumination
changes. These features are then concatenated. We integrate the
above-mentioned methods to take full advantage of all three approaches.
Extraction of the significant blocks from the absolute gradient image, and
hence from the original image, to extract pertinent information with the idea
of dimension reduction forms the basis of the work. The efficiency of our
method is demonstrated by experiments on 1100 images from the FRAV2D face
database and 2200 images from the FERET database, where the images vary in
pose, expression, illumination, and scale, and on 400 images from the ORL face
database, where the images vary slightly in pose. Our method has shown 90.3%,
93%, and 98.75% recognition accuracy for the FRAV2D, FERET, and ORL databases,
respectively. Comment: 7 pages, Science Academy Publisher, United Kingdom
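Of the three feature sets, the LDP is the most compact to sketch: each pixel's
3x3 neighbourhood is correlated with the eight Kirsch compass masks, and the k
strongest absolute responses set bits of an 8-bit code. The choice k = 3 below
is a common convention, not necessarily the paper's setting:

```python
import numpy as np

# Eight Kirsch compass masks (E, NE, N, NW, W, SW, S, SE), the standard
# directional edge detectors used by LDP.
KIRSCH = [np.array(m, dtype=float) for m in (
    [[-3, -3, 5], [-3, 0, 5], [-3, -3, 5]],    # E
    [[-3, 5, 5], [-3, 0, 5], [-3, -3, -3]],    # NE
    [[5, 5, 5], [-3, 0, -3], [-3, -3, -3]],    # N
    [[5, 5, -3], [5, 0, -3], [-3, -3, -3]],    # NW
    [[5, -3, -3], [5, 0, -3], [5, -3, -3]],    # W
    [[-3, -3, -3], [5, 0, -3], [5, 5, -3]],    # SW
    [[-3, -3, -3], [-3, 0, -3], [5, 5, 5]],    # S
    [[-3, -3, -3], [-3, 0, 5], [-3, 5, 5]],    # SE
)]

def ldp_code(patch, k=3):
    # LDP code of one 3x3 patch: set the bits of the k directions with
    # the largest absolute Kirsch responses.
    resp = np.array([(m * patch).sum() for m in KIRSCH])
    code = 0
    for b in np.argsort(np.abs(resp))[-k:]:
        code |= 1 << int(b)
    return code
```

A face descriptor is then built by histogramming these codes over blocks, here
the significant blocks of the gradient image.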
High Performance Human Face Recognition using Gabor based Pseudo Hidden Markov Model
This paper introduces a novel methodology that combines the multi-resolution
feature of the Gabor wavelet transformation (GWT) with the local interactions
of the facial structures expressed through the Pseudo Hidden Markov Model
(PHMM). Unlike the traditional zigzag scanning method for feature extraction,
a continuous spiral scanning technique has been proposed for better feature
selection: scanning proceeds from the top-left corner to the right, then
top-down and right to left, and so on until the bottom-right of the image.
Unlike traditional HMMs, the proposed PHMM does not make the assumption of
state-conditional independence of the visible observation sequence. This is
achieved via the concept of local structures introduced by the PHMM, used to
extract facial bands and automatically select the most informative features of
a face image. Thus, the long-range dependency problem inherent to traditional
HMMs is drastically reduced. Moreover, the use of the most informative pixels
rather than the whole image makes the proposed method considerably faster for
face recognition. The method has been successfully tested on frontal face
images from the ORL, FRAV2D, and FERET face databases, where the images vary
in pose, illumination, expression, and scale. The FERET data set contains 2200
frontal face images of 200 subjects, the FRAV2D data set consists of 1100
images of 100 subjects, and the full ORL database is considered. The results
reported in this application are far better than those of recent and widely
cited systems. Comment: 9 pages. arXiv admin note: substantial text overlap
with arXiv:1312.151
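One plausible reading of the spiral scan described above is an inward
clockwise traversal of the pixel grid; the sketch below generates that
coordinate order and is an interpretation, not the authors' exact ordering:

```python
def spiral_indices(h, w):
    # Visit (row, col) pairs along the top edge left-to-right, down the
    # right edge, along the bottom edge right-to-left, up the left edge,
    # then spiral inward on the remaining sub-rectangle.
    top, bottom, left, right = 0, h - 1, 0, w - 1
    order = []
    while top <= bottom and left <= right:
        order += [(top, c) for c in range(left, right + 1)]
        order += [(r, right) for r in range(top + 1, bottom + 1)]
        if top < bottom:
            order += [(bottom, c) for c in range(right - 1, left - 1, -1)]
        if left < right:
            order += [(r, left) for r in range(bottom - 1, top, -1)]
        top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1
    return order
```

Feeding pixels to the model in this order turns the 2-D image into the 1-D
observation sequence a (P)HMM expects.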
Face Synthesis (FASY) System for Generation of a Face Image from Human Description
This paper aims at generating a new face from a human-like description using a
new concept. The FASY (FAce SYnthesis) System is a face database retrieval and
new face generation system that is under development. One of its main features
is the generation of the requested face when it is not found in the existing
database, which also allows continuous growth of the database.
Handwritten Bangla Basic and Compound character recognition using MLP and SVM classifier
A novel approach for recognition of handwritten compound Bangla characters,
along with the basic characters of the Bangla alphabet, is presented here.
Compared to Roman scripts such as English, one of the major stumbling blocks
in Optical Character Recognition (OCR) of handwritten Bangla script is the
large number of complex-shaped character classes of the Bangla alphabet. In
addition to 50 basic character classes, there are nearly 160 complex-shaped
compound character classes in the Bangla alphabet. Dealing with such a large
variety of handwritten characters with a suitably designed feature set is a
challenging problem. Uncertainty and imprecision are inherent in handwritten
script. Moreover, such a large variety of complex-shaped characters, some of
which closely resemble one another, makes OCR of handwritten Bangla characters
more difficult. Considering the complexity of the problem, the present
approach attempts to identify compound character classes from the most
frequently to the less frequently occurring ones, i.e., in order of
importance. The aim is to develop a framework for incrementally increasing the
number of learned compound character classes, from the more frequently
occurring ones to the less frequently occurring ones, along with the basic
characters. In experiments, the technique is observed to produce an average
recognition rate of 79.25% under three-fold cross validation of the data, with
scope for future improvement and extension.
A Face Recognition approach based on entropy estimate of the nonlinear DCT features in the Logarithm Domain together with Kernel Entropy Component Analysis
This paper exploits the feature extraction capabilities of the discrete cosine
transform (DCT) together with an illumination normalization approach in the
logarithm domain, which increases its robustness to variations in facial
geometry and illumination. Second, in the same domain, entropy measures are
applied to the DCT coefficients so that the maximum-entropy-preserving pixels
can be extracted as the feature vector. Thus the informative features of a
face can be extracted in a low-dimensional space. Finally, kernel entropy
component analysis (KECA), with an extension of arc-cosine kernels, is applied
to the extracted DCT coefficients that contribute most to the entropy
estimate, to obtain only those real kernel ECA eigenvectors associated with
eigenvalues having a high positive entropy contribution. The resulting system
was successfully tested on real image sequences and is robust to significant
partial occlusion and illumination changes, as validated by experiments on the
FERET, AR, FRAV2D, and ORL face databases. Using specificity and sensitivity,
we find that the best performance is achieved when Renyi entropy is applied to
the DCT coefficients. Extensive experimental comparison demonstrates the
superiority of the proposed approach with respect to recognition accuracy.
Moreover, the proposed approach is very simple, computationally fast, and can
be implemented in any real-time face recognition system. Comment: 9 pages.
Published Online August 2013 in MECS. International Journal of Information
Technology and Computer Science, 2013. arXiv admin note: text overlap with
arXiv:1112.3712 by other authors
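The two measurable ingredients of this pipeline can be sketched briefly: a 2-D
DCT built from the orthonormal DCT-II matrix, and the Renyi entropy used to
rank coefficients. Normalizing coefficient magnitudes into a probability
distribution is an illustrative assumption, not the paper's exact procedure:

```python
import numpy as np

def dct2(img):
    # 2-D DCT-II of a square image via the orthonormal DCT matrix:
    # result = C @ img @ C.T, where C[k, j] holds the cosine basis.
    n = img.shape[0]
    k = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos(
        np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0] /= np.sqrt(2)  # make the DC row orthonormal
    return C @ img @ C.T

def renyi_entropy(p, alpha=2.0):
    # Renyi entropy of order alpha for a discrete distribution p.
    p = p[p > 0]
    return np.log((p ** alpha).sum()) / (1 - alpha)

# Illustration: treat normalized |DCT coefficient| magnitudes as a
# distribution and measure its Renyi entropy.
coeffs = dct2(np.outer(np.arange(4.0), np.arange(4.0)))
p = np.abs(coeffs).ravel()
p /= p.sum()
H = renyi_entropy(p)
```

Coefficients contributing most to such an entropy estimate would then be the
ones retained as the feature vector before KECA.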
Text Region Extraction from Business Card Images for Mobile Devices
Designing a Business Card Reader (BCR) for mobile devices is a challenge for
researchers because of the heavy deformation in acquired images, the diverse
nature of business cards, and, most importantly, the computational constraints
of mobile devices. This paper presents a text extraction method designed in
our work towards developing a BCR for mobile devices. First, the background of
a camera-captured image is eliminated at a coarse level. Then, various
rule-based techniques are applied to the Connected Components (CCs) to filter
out noise and picture regions. The CCs identified as text are then binarized
using an adaptive but lightweight binarization technique. Experiments show
that the text extraction accuracy is around 98% for a wide range of
resolutions, with varying computation time and memory requirements. The
optimum performance is achieved for images at a resolution of 1024x768 pixels,
with a text extraction accuracy of 98.54% and space and time requirements of
1.1 MB and 0.16 seconds, respectively. Comment: Proc. of International
Conference on Information Technology and Business Intelligence (ITBI-09), pp.
227-235, Nov 6-8, 2009, Nagpur, India
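The rule-based CC filtering stage might look like the sketch below, which
keeps components whose size and aspect ratio are text-like. The function name
and the area and aspect-ratio thresholds are hypothetical; the paper's actual
rules and values are not reproduced here:

```python
def filter_text_components(components, img_area,
                           min_area_frac=1e-4, max_area_frac=0.1,
                           max_aspect=10.0):
    # components: list of (x, y, w, h) bounding boxes of connected
    # components. Discard tiny noise specks (area too small relative to
    # the image), large picture regions (area too large), and extremely
    # elongated shapes unlikely to be characters.
    kept = []
    for (x, y, w, h) in components:
        area = w * h
        aspect = max(w, h) / max(1, min(w, h))
        if (min_area_frac * img_area <= area <= max_area_frac * img_area
                and aspect <= max_aspect):
            kept.append((x, y, w, h))
    return kept
```

The surviving components would then be passed to the adaptive binarization
step before recognition.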